Pareto Local Policy Search for MOMDP Planning

Authors

Chiel Kooijman

Maarten de Waard

Maarten Inja

Diederik M. Roijers

Shimon Whiteson

Abstract

Standard single-objective methods such as value iteration are not applicable to multi-objective Markov decision processes (MOMDPs) because they depend on a maximization, which is not defined if the rewards are multi-dimensional. As a result, special multi-objective algorithms are needed to find a set of policies that contains all optimal trade-offs between objectives, i.e. a set of Pareto optimal policies. In this paper, we propose Pareto local policy search (PLoPS), a new planning method for MOMDPs based on Pareto local search (PLS) [3]. This method produces a good set of policies by iteratively scanning the neighbourhood of locally non-dominated policies for improvements. It is fast because neighbouring policies can be quickly identified as improvements, and their values can be computed incrementally. We test the performance of PLoPS on several MOMDP benchmarks, and compare it to popular decision-theoretic and evolutionary alternatives. The results show that PLoPS outperforms the alternatives.
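To make the notions of Pareto dominance and Pareto local search concrete, the following is a minimal sketch of a generic PLS loop on a toy problem. It is not the PLoPS algorithm from the paper: the solutions here are plain integers rather than policies, and the names `dominates`, `pareto_local_search`, `TOY_VALUES`, `toy_neighbours`, and `toy_evaluate` are hypothetical, introduced only for illustration. The sketch assumes that all objectives are maximized.

```python
from typing import Callable, Dict, Hashable, Iterable, Sequence, Tuple

Value = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u Pareto-dominates v: no worse in every objective, strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_local_search(
    start: Hashable,
    neighbours: Callable[[Hashable], Iterable[Hashable]],
    evaluate: Callable[[Hashable], Value],
) -> Dict[Hashable, Value]:
    """Generic Pareto local search: repeatedly scan the neighbourhood of
    locally non-dominated solutions and keep an archive of non-dominated ones."""
    archive: Dict[Hashable, Value] = {start: evaluate(start)}
    unexplored = [start]
    while unexplored:
        current = unexplored.pop()
        if current not in archive:
            # It was dominated and dropped after being queued.
            continue
        for cand in neighbours(current):
            if cand in archive:
                continue
            value = evaluate(cand)
            # Reject candidates that are dominated by, or duplicate, an archived value.
            if any(dominates(v, value) or v == value for v in archive.values()):
                continue
            # Drop archived solutions that the candidate dominates.
            for key in [k for k, v in archive.items() if dominates(value, v)]:
                del archive[key]
            archive[cand] = value
            unexplored.append(cand)
    return archive


# Toy two-objective problem (hypothetical values, chosen only for illustration).
TOY_VALUES: Dict[int, Value] = {0: (0, 0), 1: (1, 4), 2: (2, 2), 3: (3, 3), 4: (4, 1)}


def toy_neighbours(x: int) -> Iterable[int]:
    """Neighbours of x are the adjacent integers that exist in the toy problem."""
    return [y for y in (x - 1, x + 1) if y in TOY_VALUES]


def toy_evaluate(x: int) -> Value:
    return TOY_VALUES[x]


front = pareto_local_search(0, toy_neighbours, toy_evaluate)
print(sorted(front.items()))
```

In this toy run the search starts at solution 0 and ends with solutions 1, 3, and 4 in the archive; solution 2 is admitted at first and later dropped once its neighbour 3, which dominates it, is found. Because the search is local, a method of this kind is in general only guaranteed to return a locally non-dominated set.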

Related resources

Archive TOULOUSE Archive Ouverte (OATAO)

This study discusses the application of sequential decision making under uncertainty and mixed observability in a mixed-initiative robotic target search application. In such a robotic mission, two agents, a ground robot and a human operator, must collaborate to reach a common goal, each using their recognized skills in turn. The originality of the work lies in considering that the human oper...

Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes

This paper is devoted to fair optimization in Multiobjective Markov Decision Processes (MOMDPs). A MOMDP is an extension of the MDP model for planning under uncertainty while trying to optimize several reward functions simultaneously. This applies to multiagent problems when rewards define individual utility functions, or in multicriteria problems when rewards refer to different features. In th...

Pareto Adaptive Decomposition algorithm

Dealing with multi-objective combinatorial optimization and local search, this article proposes a new multi-objective meta-heuristic named Pareto Adaptive Decomposition algorithm (PAD). Combining ideas from decomposition methods, two phase algorithms and multi-armed bandit, PAD provides a 2-phase modular framework for finding an approximation of the Pareto front. The first phase decomposes the ...

Towards a MOMDP model for UAV safe path planning in urban environment

This paper tackles the problem of safe UAV path planning in an urban environment in which the UAV is at risk of GPS signal occlusion and obstacle collision. The key idea is to perform the UAV path planning along with its navigation and guidance mode planning, where each of these modes uses different sensors whose availability and performance are environment-dependent. A partial knowledge on the envi...

On Universal Search Strategies for Multi-Criteria Optimization

We develop a stochastic local search algorithm for finding Pareto points for multi-criteria optimization problems. The algorithm alternates between different single-criterion optimization problems characterized by weight vectors. The policy for switching between different weights is an adaptation of the universal restart strategy defined by [LSZ93] in the context of Las Vegas algorithms. We dem...

Publication date: 2015
